
Limit repo size #21820

Draft
DmitryFrolovTri wants to merge 192 commits into go-gitea:main from DmitryFrolovTri:limit-repo-size

Conversation

@DmitryFrolovTri
Contributor

@DmitryFrolovTri DmitryFrolovTri commented Nov 15, 2022

The goal of this PR is to define a repo size limit using a "no-worsen" approach (allow operations that do not eventually increase the size on disk, or that decrease it). I think org- and user-level restrictions could come later.
Thanks to @sapk as this is a continuation of his work that was started in #7833
Would address (except for LFS): #3658
Updated Jan 22, 2026: no more DB limit. Only config settings and a run-time switch in the UI for the admin.

**Screenshots:** global limit setting for the admin (resets after restart; persistent values should be managed in the config) - admin_settings_repository_screen

individual repo size limit for a repository admin (visible to the user who owns the repo)
repo_settings_screen

app.ini with settings

;; Specify a global repository Git size limit in bytes. -1 = disabled, 0 = limit to zero bytes.
;; Standard size units can be used (B, KB, KiB, ..., EB, EiB); if no unit is provided, bytes are assumed.
;; If the limit is reached, an operation is still allowed as long as it does not increase disk consumption.
;; This is experimental and subject to change
;GIT_SIZE_MAX = -1

;; Specify a global repository LFS size limit in bytes. -1 = disabled, 0 = limit to zero bytes.
;; Standard size units can be used (B, KB, KiB, ..., EB, EiB); if no unit is provided, bytes are assumed.
;; If the limit is reached, an operation is still allowed as long as it does not increase disk consumption.
;; This is experimental and subject to change
;LFS_SIZE_MAX = -1
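The unit handling described in these comments (bare numbers are bytes; B/KB/KiB/.../EiB suffixes are accepted) can be sketched as follows. This is a minimal illustrative parser, not the PR's real helper (`setting.ParseRepositorySizeLimit`, whose exact behavior may differ):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSizeLimit is an illustrative parser for values like "-1", "1000",
// "1K" or "10 MiB". It is NOT the exact Gitea implementation.
func parseSizeLimit(s string) (int64, error) {
	s = strings.TrimSpace(s)
	if s == "" || s == "-1" {
		return -1, nil // disabled
	}
	// Longer suffixes must be checked before their prefixes ("KiB" before "K").
	units := []struct {
		suffix string
		mult   int64
	}{
		{"EiB", 1 << 60}, {"PiB", 1 << 50}, {"TiB", 1 << 40},
		{"GiB", 1 << 30}, {"MiB", 1 << 20}, {"KiB", 1 << 10},
		{"EB", 1e18}, {"PB", 1e15}, {"TB", 1e12},
		{"GB", 1e9}, {"MB", 1e6}, {"KB", 1e3},
		{"E", 1 << 60}, {"P", 1 << 50}, {"T", 1 << 40},
		{"G", 1 << 30}, {"M", 1 << 20}, {"K", 1 << 10},
		{"B", 1},
	}
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseFloat(strings.TrimSpace(strings.TrimSuffix(s, u.suffix)), 64)
			if err != nil {
				return 0, err
			}
			return int64(n * float64(u.mult)), nil
		}
	}
	// No unit suffix: plain bytes.
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	for _, v := range []string{"-1", "0", "1000", "1K", "10 MiB"} {
		n, _ := parseSizeLimit(v)
		fmt.Printf("%s => %d\n", v, n)
	}
}
```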

how a push is currently rejected when the limit is reached
web_reject

TODO: (1)

  • Calculate push size (sapk still had some corner cases to test, mostly to not block deletions and some force pushes)
  • Edit max repo size
  • Enforce repo size
  • Fix tests
  • Add a global per-repo limit in config for LFS and for Git separately. DEFAULT: -1 = disabled; 0 enables the feature and limits to 0 B. (Should accept bytes, K, M, G, maybe T: 1000 = 1000 bytes, 1K = 1 kilobyte, 1M = 1 megabyte.)
  • UI to enable/disable the global repository limit (note: the setting resets to the config value on every Gitea restart)
    Removed Jan 22, 2026: - [x] Add ability to turn the feature on or off in the config (repository section), default FALSE - a boolean config parameter (ENABLE_SIZE_LIMIT = true/false) to enable size checking. If it is enabled, the size limit is enforced and there is a field to edit the limit in the repo settings UI. If disabled, the repo size limit is ignored; we can leave the size limit field in the repo settings UI and allow editing it, but the value should be grayed out (telling the user that it is not working).

    - [x] Add ability to have a global per-installation repo limit that is enforced unless an individual repo size limit is present (repository section in config). DEFAULT: 0 = no limit; if >0, this limit is used for any repo whose individual size limit is 0/undefined. Config parameter REPO_SIZE_LIMIT = XXXX (should accept bytes, K, M, G, maybe T: 1000 = 1000 bytes, 1K = 1 kilobyte, 1M = 1 megabyte).

TOFIX: (2)

  • UI operations trigger a 500 error when the repo is over the limit - present a correct message instead.
Deletion of a file from the UI triggers a 500 when the repo is over the limit. -> TODO: catch this specific error. 2019/08/16 05:23:58 ...uters/repo/editor.go:432:DeleteFilePost() [E] DeleteRepoFile: git push: remote: Gitea: new repo size is over limitation 10000 To /home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git ! [remote rejected] d9629b41f9c58da756cf806aabf5811b1ff45b50 -> master (pre-receive hook declined) error: failed to push some refs to '/home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git'
Creation of a branch from the UI triggers a 500 when the repo is over the limit. -> TODO: catch this specific error. 2019/08/16 05:28:42 ...uters/repo/branch.go:287:CreateBranch() [E] CreateNewBranch: Push: exit status 1 - remote: Gitea: new repo size is over limitation 10000 To /home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git ! [remote rejected] test -> test (pre-receive hook declined) error: failed to push some refs to '/home/sapk/go/src/code.gitea.io/gitea/data/repositories/sapk/test.git'

Removed Jan 22, 2026: - [x] Add ability to count LFS size towards repo size for installations that store LFS on the same disk as the repo itself.

Note: if size checking is enabled and a push would breach the limit, the push is accepted only if the total size of unreferenced (removed) objects is greater than or equal to the total size of newly added objects. In other words, pushes are accepted if they stay at or under the limit after their operations, or, if they breach the limit, only if they don't grow disk usage. Since git controls when unreferenced objects are purged, and that is not fast, this condition can last for a while. The instance administrator can speed it up via the following steps:

  1. Run the following from the data folder of the specific repository on the Gitea server (<path_to_gitea_server_folder>/data/gitea-repositories/<user>/<repository>):
git reflog expire --expire-unreachable=all --all
git gc --prune=now
  2. Or execute Git GC from the UI.
    Either would free up all unreferenced objects and update the repository size in the UI. On the next push (if the push size is smaller than the limit), adding new objects will be allowed.
  3. There is a known bug in LFS - raised as issue #36169: the LFS maintenance scripts do not update repo.LFSSize after orphaned-object removal, so repo.LFSSize is not actually updated after an LFS garbage collect / LFS storage doctor run. Currently the occupied LFS size cannot decrease at all, so a user who reaches the limit might get stuck.
  • Review the need for a doDeleteCommitAndPush-type test - not needed, reverted
  • Prevent the upload as well if it would breach the repo size limit (server.go). Here it can't be "no-worsen"; it will fail if the new size is over the limit:
$ git lfs push origin main --all
batch response: LFS size 1.7 GiB would exceed limit 1.0 MiB
  • Enforce repo size with LFS added
  • Adequate error messages to the user on LFS operations (when they fail due to size)
  • Add tests for LFS sizes
  • Add an LFS-specific repo size limit configuration option and UI to edit it
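A check like the batch rejection shown above ("LFS size ... would exceed limit ...") boils down to comparing the repository's current LFS footprint plus the requested uploads against the configured cap. A minimal sketch, with illustrative names rather than Gitea's actual LFS server API:

```go
package main

import "fmt"

// checkLFSUpload sketches the batch-endpoint limit check; the names and
// signature are illustrative, not Gitea's actual code.
func checkLFSUpload(currentSize, limit int64, uploads []int64) error {
	if limit < 0 {
		return nil // -1 disables the limit
	}
	newSize := currentSize
	for _, sz := range uploads {
		newSize += sz
	}
	if newSize > limit {
		return fmt.Errorf("LFS size %d B would exceed limit %d B", newSize, limit)
	}
	return nil
}

func main() {
	// One 200-byte upload on a 900-byte repo with a 1000-byte cap is rejected.
	fmt.Println(checkLFSUpload(900, 1000, []int64{200}))
}
```

Note that, as KN4CK3R points out below, the check has to be done at batch time: once the pointer push and the object upload are validated separately, one of them can succeed while the other is denied.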

NEXT PR (4)

  • Prevent migrating (mirror) repositories that would overflow the limit / extend the size-checking logic into the code for repository mirrors. We shouldn't mirror if that would breach the limit at repo/user/org level.
  • Develop a go-git variant for size calculation
  • Add a test to confirm that objects already existing in the store do not count as new, in both server.go and hook_pre_receive.go
  • Not agreed yet: update the Git GC logic to allow faster space release

EVEN further PR (5)

  • Add a per-user/organisation account size limit that can be set by the administrator (the global per-repo limit is used everywhere unless a per-repo limit is set; if an action crosses such an account limit, the action should be denied)
  • Implement an organization/user-level size restriction
  • Add a hard limit on repository size that would cancel any operation (config option with a percentage of overage and a special message). No commit can be accepted in that case. This is needed because the current limit is "soft": it does allow disk usage to increase, though it already prevents pushes that move the repository over the limit.

Related: #3658 #7833
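The "no-worsen" rule from the description above can be sketched as a single predicate (names and signature are illustrative, not the PR's actual code):

```go
package main

import "fmt"

// allowPush sketches the "no-worsen" acceptance rule.
// currentSize: repo size on disk before the push.
// addedSize:   total size of newly added objects in the push.
// removedSize: total size of objects the push leaves unreferenced.
// limit:       configured size limit (-1 = disabled).
func allowPush(currentSize, addedSize, removedSize, limit int64) bool {
	if limit < 0 {
		return true // limit disabled
	}
	if currentSize+addedSize-removedSize <= limit {
		return true // at or under the limit after the push
	}
	// Over the limit: accept only if the push does not grow disk usage.
	return addedSize <= removedSize
}

func main() {
	fmt.Println(allowPush(900, 200, 0, 1000))    // false: would grow past the limit
	fmt.Println(allowPush(900, 200, 300, 1000))  // true: net shrink
	fmt.Println(allowPush(1200, 100, 100, 1000)) // true: over limit, but no growth
}
```

This is also why deletions and force pushes are the tricky corner cases mentioned in the TODO list: they are exactly the pushes where removedSize matters.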

@techknowlogick techknowlogick mentioned this pull request Nov 15, 2022
5 tasks
Comment thread modules/git/repo.go Outdated
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Nov 15, 2022
@lafriks
Member

lafriks commented Nov 15, 2022

Does it also count LFS? (Currently repo size is git repo size + LFS object size.)

Imho we should finally split this into two repo sizes in the db (git repo size and LFS object size), and both should have separately settable limits.

@DmitryFrolovTri
Contributor Author

@lafriks it does count the LFS now and the repo limit applies to both, which I think is correct.

@KN4CK3R
Member

KN4CK3R commented Nov 15, 2022

LFS size is not counted. And LFS is tricky because we may end up with LFS files uploaded and the pointer push is denied because of the limit or (worse) the pointer push is allowed and the file upload fails afterwards because of the limit.

@DmitryFrolovTri
Contributor Author

Ok will look into this further. Thank you!

@lafriks
Member

lafriks commented Nov 15, 2022

Yes, that's why my suggestion was for those to be different limits, and also to split how the sizes are saved in the database for a repo.

@lunny lunny added the type/enhancement An improvement of existing functionality label Nov 16, 2022
@DmitryFrolovTri
Contributor Author

Hi @techknowlogick @kdumontnu I have fixed the OpenCollective link to make it easy to donate. I've moved the funds I was able to gather ($5) there :) Here is the new link for this activity: https://opencollective.com/oss-code-ge/projects/gitea-limit-repository-size

@morevnaproject

Hi @DmitryFrolovTri
We are interested in this feature and have contributed to your campaign.
We will spread the word on our social media. Keep it up! 💪

@techknowlogick
Member

@DmitryFrolovTri I've heard back from OC, and because you are not using them as a fiscal sponsor, I couldn't make the transfer. So I will just mention it here: upon completion of this PR/issue we will pay out $500 from our collective for the bounty (this amount was chosen to limit the tax burden on whoever gets paid out). And @sapk, if you are interested, we can pay you for your work so far (reach out to me via email and I can get you sorted).

@DmitryFrolovTri
Contributor Author

@techknowlogick I've since re-registered, and the link above is with OpenCollective (OC host) now.

@DmitryFrolovTri
Contributor Author

DmitryFrolovTri commented Jan 28, 2026

> @DmitryFrolovTri you can just remove all empty options/locale/locale_*.json, they don't exist on main branch.

Carefully merged everything in this folder from upstream/main, removing anything I had, with the exception of one file I did modify. Should be correct, @silverwind.

@silverwind
Member

@lafriks consider unblocking.

Comment thread services/lfs/server.go
func traceBatchDecision(rc *requestContext, op, msg string, args ...any) {
prefix := fmt.Sprintf("LFS[BATCH][%s/%s][op=%s] ", rc.User, rc.Repo, op)
log.Trace(prefix+msg, args...)
}
Contributor


Why a log-only function?

Comment on lines +144 to +161
if ctx.Data["Err_Repo_Size_Limit"] != nil {
ctx.RenderWithErr(ctx.Tr("admin.config.invalid_repo_size", ctx.Data["Err_Repo_Size_Limit"]),
opts.TplName, nil)
return
}

if ctx.Data["Err_LFS_Size_Limit"] != nil {
ctx.RenderWithErr(ctx.Tr("admin.config.invalid_lfs_size", ctx.Data["Err_LFS_Size_Limit"]),
opts.TplName, nil)
return
}

if ctx.Data["Err_Repo_Size_Save"] != nil {
ctx.RenderWithErr(ctx.Tr("admin.config.save_repo_size_setting_failed", ctx.Data["Err_Repo_Size_Save"]),
opts.TplName, nil)
return
}

Contributor


Why is this in RenderRepoSearch, when these are "admin"-related?

Comment on lines +57 to +67
gitSizeMax, err := setting.ParseRepositorySizeLimit(form.GitSizeMax)
if err != nil {
ctx.Data["Err_Git_Size_Max"] = form.GitSizeMax
explore.RenderRepoSearch(ctx, &explore.RepoSearchOptions{
Private: true,
PageSize: setting.UI.Admin.RepoPagingNum,
TplName: tplRepos,
OnlyShowRelevant: false,
})
return
}
Contributor


Use form-fetch-action or link-action, and JSON response.

Comment on lines +376 to +380
var (
lfsPointerMarker = []byte("version https://git-lfs.github.com/spec/v1")
lfsOIDRe = regexp.MustCompile(`(?m)^oid sha256:([0-9a-f]{64})$`)
lfsSizeRe = regexp.MustCompile(`(?m)^size ([0-9]+)$`)
)
Contributor


There should already be functions like ReadPointerFromBuffer

Comment on lines +262 to +263
// This is the number of workers that will simultaneously process CalculateSizeOfObject.
const numWorkers = 10
Contributor


I really don't think it's right to start 10 git processes in one push action.

for _, packFile := range packFiles {
log.Trace("Processing packfile %s", packFile)
// Extract the object sizes by parsing the output of `git verify-pack` and store them in the objectsSizes cache
output, _, err := gitcmd.NewCommand("verify-pack", "-v").AddDynamicArguments(packFile).WithDir(dir).WithEnv(env).RunStdString(ctx)
Contributor


Will it be very slow on a large repo?

Contributor


Yes, it is extremely slow; it will shut down the whole Gitea instance.

~/work/gitea/.git/objects/pack$ time git verify-pack pack-7b1d851cf9f2f68ab601a8c44194c03332818497.pack
git verify-pack pack-7b1d851cf9f2f68ab601a8c44194c03332818497.pack  20.47s user 0.84s system 162% cpu 13.093 total
~/work/gitea/.git/objects/pack$

Contributor


Considering this is very slow (probably because git has to unpack the pack file in pre-receive, and does it again after exit 0?), my naive thought would be lossy checks based on filesize(pack-7b1d851cf9f2f68ab601a8c44194c03332818497.pack). Calculating a file size is cheap; unpacking a large pack file is expensive.

A feasibility check of my random thoughts about disk-size-based limits would have to be done first...

Member

@silverwind silverwind Jan 30, 2026


Isn't checking the size of the quarantine area alone enough? I've prompted Gemini with "what is the fastest method to estimate the new size of a git repo in a pre-receive hook" (screenshots of its answer were attached).

@wxiaoguang wxiaoguang marked this pull request as draft January 30, 2026 09:09

This comment was marked as off-topic.

Contributor

@wxiaoguang wxiaoguang left a comment


I don't think the implementation can really work in production

@go-gitea/maintainers please help to improve.

@wxiaoguang
Contributor

@silverwind: although there are AI tools, please make sure the code and design are overall right before asking AI to review or approve.

@silverwind
Member

Yeah, I know; AI mostly just catches surface-level issues, not design issues.

@lunny
Member

lunny commented Jan 30, 2026

I think the abstraction level in this PR is too low, which may make it difficult to extend in the future.

@DmitryFrolovTri
Contributor Author

DmitryFrolovTri commented Feb 10, 2026

Well, this means we are stuck here somewhat. @lunny @wxiaoguang could you steer a bit then?
Shall we do another PR similar to Forgejo's, with some hierarchical limits?
Any direction where the abstraction should be improved?
So far I got the following main comments:

  1. the abstraction is not at the required level
  2. the calculations are intensive (this I could fix by reverting to disk checking, which would not let the user reduce the size of the repo himself and would require the admin to do something; disk checking doesn't allow estimating the reduction in size that this PR brings, so we would not be able to accept pushes that are deletions or contain no changes). If we are ok with that, then this is a way to do it.

But for 1) I need some idea: mine is to drop this PR and do something Forgejo-style.

Shall we go in the direction of:
for 1) doing it Forgejo-style,
for 2) abandoning acceptance of pushes bringing reductions/deletions?
Or some other abstraction-level improvement?

@wxiaoguang
Contributor

I don't know; I don't need this feature and haven't really looked into the details. I was just pointing out some essential design problems.

@silverwind
Member

silverwind commented Feb 18, 2026

@DmitryFrolovTri I think I could exercise Claude on this. Are you ok if I push fixes for all remaining discussions here? If you have a qualified agent available yourself, you can do it yourself too.

@wxiaoguang
Contributor

> @DmitryFrolovTri I think I could exercise Claude on this. Are you ok if I push fixes for all remaining discussions here? If you have a qualified agent available yourself, you can do it yourself too.

For this complex task, you need to make sure you fully understand what you are doing, but not just blindly trust AI.

@DmitryFrolovTri
Contributor Author

> @DmitryFrolovTri I think I could exercise Claude on this. Are you ok if I push fixes for all remaining discussions here? If you have a qualified agent available yourself, you can do it yourself too.
>
> For this complex task, you need to make sure you fully understand what you are doing, but not just blindly trust AI.

Fully agree to this

@bircni
Member

bircni commented Apr 22, 2026

@silverwind lets reenable

@DaanSelen
Contributor

> @silverwind lets reenable

Good to see

@silverwind silverwind self-requested a review April 22, 2026 19:47

Labels

docs-update-needed - The document needs to be updated synchronously
lgtm/blocked - A maintainer has reservations with the PR and thus it cannot be merged
type/docs - This PR mainly updates/creates documentation
type/feature - Completely new functionality. Can only be merged if feature freeze is not active.
